A New Margin-Based Criterion for Efficient Gradient Descent
Authors
Abstract
During the last few decades, several papers were published about second-order optimization methods for gradient-descent-based learning algorithms. Unfortunately, these methods usually have a cost in time close to O(n²) per iteration, and O(n²) in space, where n is the number of parameters to optimize, which is intractable for the large optimization systems usually found in real-life problems. Moreover, these methods are usually not easy to implement. Many enhancements have also been proposed in order to overcome these problems, but most of them still cost O(n²) in time per iteration. Instead of trying to solve a hard optimization problem using complex second-order tricks, we propose to modify the problem itself in order to optimize a simpler one, by simply changing the cost function used during training. Furthermore, we argue that analyzing the Hessian resulting from the choice of various cost functions is very informative and could help in the design of new machine learning algorithms. For instance, we propose in this paper a version of the Support Vector Machines criterion applied to Multi-Layer Perceptrons, which yields very good training and generalization performance in practice. Several empirical comparisons on two benchmark data sets are given to justify this approach.
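To make the abstract's idea concrete, here is a minimal sketch of the core recipe: train an ordinary multi-layer perceptron with plain first-order stochastic gradient descent, but on an SVM-style margin (hinge) loss max(0, 1 − y·f(x)) with targets y ∈ {−1, +1} instead of the usual mean-squared error. The toy data, architecture, and hyper-parameters below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

# Minimal sketch: a one-hidden-layer MLP trained by first-order SGD on the
# SVM-style hinge loss  L(x, y) = max(0, 1 - y * f(x)),  y in {-1, +1}.
# Toy data, layer sizes and learning rate are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0.0, 1.0, -1.0)

n_hidden, lr = 16, 0.05
W1 = rng.normal(scale=0.5, size=(2, n_hidden)); b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.5, size=n_hidden);      b2 = 0.0

for epoch in range(50):
    for i in rng.permutation(len(X)):
        h = np.tanh(X[i] @ W1 + b1)        # hidden layer
        f = h @ w2 + b2                    # scalar network output
        if 1.0 - y[i] * f > 0.0:           # inside the margin: loss is active
            g = -y[i]                      # dL/df of the hinge loss
            gh = g * w2 * (1.0 - h**2)     # backprop through tanh
            w2 -= lr * g * h;  b2 -= lr * g
            W1 -= lr * np.outer(X[i], gh); b1 -= lr * gh

margins = y * (np.tanh(X @ W1 + b1) @ w2 + b2)
print("training error rate:", np.mean(margins <= 0.0))
```

One practical side effect of the margin criterion is visible in the inner loop: examples outside the margin contribute no gradient at all, so many SGD updates are skipped entirely.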
Similar resources
A New Hybrid Conjugate Gradient Method Based on Eigenvalue Analysis for Unconstrained Optimization Problems
In this paper, two extended three-term conjugate gradient methods based on the Liu-Storey (LS) conjugate gradient method are presented to solve unconstrained optimization problems. A remarkable property of the proposed methods is that, based on an eigenvalue analysis, the search direction always satisfies the sufficient descent condition independently of the line search method. The global...
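For context (this notation is not in the abstract itself): writing $y_k = g_{k+1} - g_k$, the classical Liu-Storey parameter that such methods extend is

$$\beta_k^{LS} = \frac{g_{k+1}^{T} y_k}{-g_k^{T} d_k},$$

and the sufficient descent condition referred to above is, in its standard form, $g_k^T d_k \le -c\,\|g_k\|^2$ for some constant $c > 0$ independent of the iteration $k$.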
Full text
An eigenvalue study on the sufficient descent property of a modified Polak-Ribière-Polyak conjugate gradient method
Based on an eigenvalue analysis, a new proof for the sufficient descent property of the modified Polak-Ribière-Polyak conjugate gradient method proposed by Yu et al. is presented.
Full text
A new Levenberg-Marquardt approach based on conjugate gradient structure for solving absolute value equations
In this paper, we present a new approach for solving the absolute value equation (AVE) which uses a Levenberg-Marquardt method with conjugate subgradient structure. In conjugate subgradient methods, the new direction is obtained by combining the steepest descent direction and the previous direction, which may not lead to good numerical results. Therefore, we replace the steepest descent direction...
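As background on what an AVE solver looks like, the sketch below applies Mangasarian's classical generalized Newton iteration to the absolute value equation $Ax - |x| = b$. This is a simple baseline shown for illustration, not the Levenberg-Marquardt / conjugate-subgradient method proposed in the paper, and the test problem is an assumption.

```python
import numpy as np

# Background sketch: solve the absolute value equation  Ax - |x| = b  with
# Mangasarian's generalized Newton iteration  x_{k+1} = (A - D(x_k))^{-1} b,
# where D(x) = diag(sign(x)) is a generalized Jacobian of |x|.
# This is a classical baseline, NOT the paper's Levenberg-Marquardt method.
rng = np.random.default_rng(2)
n = 8
G = rng.normal(size=(n, n))
A = 2.0 * np.eye(n) + G / (2.0 * np.linalg.norm(G, 2))  # sigma_min(A) > 1
x_true = rng.normal(size=n)                             # build a consistent AVE
b = A @ x_true - np.abs(x_true)

x = np.zeros(n)
for _ in range(50):
    D = np.diag(np.sign(x))
    x_new = np.linalg.solve(A - D, b)   # one generalized Newton step
    if np.linalg.norm(x_new - x) < 1e-12:
        x = x_new
        break
    x = x_new

print("residual ||Ax - |x| - b|| =", np.linalg.norm(A @ x - np.abs(x) - b))
```

Keeping the singular values of $A$ above 1 makes the AVE uniquely solvable and every matrix $A - D$ nonsingular, which is why the iteration above is well defined.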
Full text
A Note on the Descent Property Theorem for the Hybrid Conjugate Gradient Algorithm CCOMB Proposed by Andrei
In [1] (Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization, J. Optim. Theory Appl. 141 (2009) 249-264), an efficient hybrid conjugate gradient algorithm, the CCOMB algorithm, is proposed for solving unconstrained optimization problems. However, the proof of Theorem 2.1 in [1] is incorrect due to an erroneous inequality which was used to indicate the descent property for the search...
Full text
A New Conjugate Gradient Method with Guaranteed Descent and an Efficient Line Search
A new nonlinear conjugate gradient method and an associated implementation, based on an inexact line search, are proposed and analyzed. With exact line search, our method reduces to a nonlinear version of the Hestenes-Stiefel conjugate gradient scheme. For any (inexact) line search, our scheme satisfies the descent condition $g_k^T d_k \le -\frac{7}{8}\,\|g_k\|^2$. Moreover, a global convergence result is established...
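Since the abstract states the descent bound explicitly, it is easy to check numerically. The sketch below uses the direction update $d_{k+1} = -g_{k+1} + \beta_k^N d_k$ with the $\beta_k^N$ formula associated with this Hager-Zhang method (stated here from memory of the published formula, so treat the exact expression as an assumption) and verifies $g_k^T d_k \le -\frac{7}{8}\|g_k\|^2$ on a random convex quadratic with exact line search; the point of the theorem is that the bound holds for any line search.

```python
import numpy as np

# Sketch: verify the descent condition  g_k^T d_k <= -(7/8) ||g_k||^2  for the
# Hager-Zhang direction update on a convex quadratic f(x) = 0.5 x^T A x - b^T x.
# The beta formula and the exact-line-search test setup are assumptions here.
rng = np.random.default_rng(1)
n = 10
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)          # symmetric positive definite Hessian
b = rng.normal(size=n)
grad = lambda x: A @ x - b

x = rng.normal(size=n)
g = grad(x)
d = -g                               # d_0 = -g_0
for k in range(8):                   # stay well away from exact convergence
    alpha = -(g @ d) / (d @ A @ d)   # exact line search on the quadratic
    g_new = grad(x + alpha * d)
    yk = g_new - g
    dy = d @ yk
    beta = ((yk - 2.0 * d * (yk @ yk) / dy) @ g_new) / dy
    x, g = x + alpha * d, g_new
    d = -g + beta * d
    assert g @ d <= -(7.0 / 8.0) * (g @ g) + 1e-12
print("g^T d <= -(7/8)||g||^2 held at every iterate")
```

Under exact line search the second term of beta vanishes and the update collapses to the Hestenes-Stiefel formula, matching the reduction mentioned in the abstract.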
Full text